Exploring Expressive Speech Space in an Audio-book

Authors

  • Lijuan Wang
  • Yong Zhao
  • Min Chu
  • Yining Chen
  • Frank Soong
  • Zhigang Cao
Abstract

In this paper, an audio-book, in which a professional voice talent performs multiple characters, is exploited to investigate the expressiveness of speech. The expressive speech space of the sole speaker is explored by finding the distances between acoustic models of multiple characters and the perceived proximity between their speech utterances. Using the speech of ten characters as test data, the character confusion is evaluated in both acoustic space and perceptual space. We find that the average precision to differentiate one character from the others is 81.7% in the acoustic space and 72.6% in the perceptual space. It is interesting that the objective measure outperforms the subjective measure. Furthermore, the acoustic distance measured by normalized Kullback-Leibler divergence (NKLD) between two characters is highly correlated with the perceptual distance. The correlation coefficient is 0.814. Therefore, NKLD can measure the perceptual similarity between groups of utterances objectively.
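As a rough illustration of the kind of comparison the abstract describes, the sketch below computes a symmetric Kullback-Leibler distance between two acoustic models and then correlates acoustic distances with perceptual ones. It rests on several assumptions not taken from the paper: each character is modeled by a single diagonal-covariance Gaussian, the plain symmetric KL divergence stands in for NKLD (the abstract does not define the normalization), and all function names and numbers are hypothetical toy values.

```python
import math


def kl_diag_gaussian(mu_p, var_p, mu_q, var_q):
    """KL(p || q) between two diagonal-covariance Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def symmetric_kl(mu_p, var_p, mu_q, var_q):
    """Symmetrized KL; an assumed stand-in for the paper's NKLD."""
    return (kl_diag_gaussian(mu_p, var_p, mu_q, var_q)
            + kl_diag_gaussian(mu_q, var_q, mu_p, var_p))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Toy models for three hypothetical characters (made-up values, not paper data).
models = {
    "A": ([0.0, 0.0], [1.0, 1.0]),
    "B": ([1.0, 0.0], [1.0, 1.0]),
    "C": ([3.0, 1.0], [2.0, 1.0]),
}
pairs = [("A", "B"), ("A", "C"), ("B", "C")]
acoustic = [symmetric_kl(*models[a], *models[b]) for a, b in pairs]

# Hypothetical perceptual distances for the same pairs (made-up values).
perceptual = [0.2, 0.9, 0.6]
r = pearson(acoustic, perceptual)
```

A correlation close to 1 on real data would support using the acoustic distance as an objective proxy for perceived similarity, which is the claim the abstract makes for NKLD.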


Similar resources

Exploring Rich Expressive Information from Audiobook Data Using Cluster Adaptive Training

Audiobook data is a freely available source of rich expressive speech data. To accurately generate speech of this form, expressiveness must be incorporated into the synthesis system. This paper investigates two parts of this process: the representation of expressive information in a statistical parametric speech synthesis system; and whether discrete expressive state labels can sufficiently rep...


Exploring EFL Learners’ Use of Formulaic Sequences in Pragmatically Focused Role-play Tasks

Communicative language use largely entails regular patterns consisting of pre-constructed phrases or sequences. These sequences have been examined by many researchers to find the situation-based formulas which may help L2 learners follow a possibly more target-like speaking system. This study, therefore, explored two categories of formulaic expressions including speech formulas and situation-bo...


Acoustic and Visual Analysis of Expressive Speech: A Case Study of French Acted Speech

Within the framework of developing an expressive audiovisual speech synthesis, an acoustic and visual analysis of expressive acted speech is proposed in this paper. Our purpose is to identify the main characteristics of audiovisual expressions that need to be integrated during synthesis to provide believable emotions to the virtual 3D talking head. We conducted a case study of a semi-profession...


Acoustic quality assessment at Nezamol molk dome of Jame mosque of Isfahan

Incontrovertibly, the sense of hearing is one of the five most substantial human senses. In fact, the human ear receives sound and transmits it to the brain through the auditory organs. Hence, sound can be considered one of the key tools of human communication with each other and with the surrounding environment. Since acoustics has a profound impact on the body, soul, and the performance of human ...


The ILSP Text-to-Speech System for the Blizzard Challenge 2012

This paper describes ILSP and INNOETICS Speech Synthesis System entry for the Blizzard Challenge 2012. A description of the underlying system and techniques used are provided, as well as information about the voice building process and discussion on the obtained evaluation results. Additional focus will be given to new processes or techniques we used this year in comparison to our previous part...




Publication date: 2005